Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis

Authors

  • Zhen-Hua Ling
  • Yu Hu
  • Li-Rong Dai
Abstract

This paper presents a method for modeling the global variance (GV) of the log power spectra derived from line spectral pairs (LSPs) within a sentence for HMM-based parametric speech synthesis. Unlike the conventional GV method, in which the observations used to train the GV model are the variances of the spectral parameters of each training sentence, the proposed method directly models the temporal variance at each frequency point of the spectral envelope reconstructed from the LSPs. At the synthesis stage, the likelihood function of the trained GV model is integrated into the maximum likelihood parameter generation algorithm to alleviate the over-smoothing of the generated spectral structures. Experimental results show that the proposed method can outperform the conventional GV method when LSPs are used as the spectral parameters and that it significantly improves the naturalness of the synthetic speech.
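The quantity described in the abstract, the temporal variance of the log power spectrum at each frequency point, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes an even LSP order, LSPs given in radians in ascending order, a unit-gain all-pole envelope 1/|A(e^{jw})|^2, and a uniform linear frequency grid. The function names `lsp_to_log_power_spectrum` and `lps_global_variance` and the parameter `num_points` are hypothetical, and the abstract does not state how gain or frequency sampling are handled in the paper.

```python
import cmath
import math


def lsp_to_log_power_spectrum(lsp, num_points=64):
    """Log power spectrum of the all-pole envelope 1/|A(e^{jw})|^2 for one frame.

    Assumptions (not taken from the paper): even LSP order, LSPs in radians
    sorted ascending within (0, pi), unit gain, uniform frequency grid.
    """
    if len(lsp) % 2 != 0:
        raise ValueError("this sketch only handles an even LSP order")
    lps = []
    for k in range(num_points):
        # Mid-point grid over (0, pi).
        w = math.pi * (k + 0.5) / num_points
        z_inv = cmath.exp(-1j * w)
        # Symmetric polynomial P(z) and antisymmetric polynomial Q(z).
        p = 1 + z_inv
        q = 1 - z_inv
        for i, w_i in enumerate(lsp):
            factor = 1 - 2 * math.cos(w_i) * z_inv + z_inv * z_inv
            if i % 2 == 0:
                p *= factor  # 1st, 3rd, ... LSPs are roots of P(z)
            else:
                q *= factor  # 2nd, 4th, ... LSPs are roots of Q(z)
        a = 0.5 * (p + q)  # LPC polynomial A(z) = (P(z) + Q(z)) / 2
        lps.append(-math.log(abs(a) ** 2))
    return lps


def lps_global_variance(lsp_frames, num_points=64):
    """Variance over time of the log power spectrum at each frequency point."""
    spectra = [lsp_to_log_power_spectrum(f, num_points) for f in lsp_frames]
    num_frames = len(spectra)
    gv = []
    for k in range(num_points):
        values = [s[k] for s in spectra]
        mean = sum(values) / num_frames
        gv.append(sum((v - mean) ** 2 for v in values) / num_frames)
    return gv


# Two made-up 4th-order LSP frames standing in for one "sentence".
frames = [[0.5, 1.0, 1.8, 2.5], [0.6, 1.2, 1.7, 2.4]]
gv = lps_global_variance(frames, num_points=8)
```

In this sketch, one such GV vector per training sentence would be the observation for a GV model. The form of that model and the way its likelihood enters parameter generation are not specified in the abstract and are not shown here.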

Similar resources

Tue.O5d.04 Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis

This paper utilizes global variance (GV) of the log power spectrum (LPS) derived from mel-cepstrum to improve hidden Markov model (HMM) based parametric speech synthesis. In order to alleviate over-smoothing of the generated spectral structures, an LPS-GV modeling method using line spectral pairs (LSPs) has been proposed in our previous work, where the estimated distribution of LPS-GV was combi...

Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis

A minimum generation error (MGE) criterion had been proposed to solve the issues related to maximum likelihood (ML) based HMM training in HMM-based speech synthesis. In this paper, we improve the MGE criterion by imposing a log spectral distortion (LSD) instead of the Euclidean distance to define the generation error between the original and generated line spectral pair (LSP) coefficients. More...

An HMM-Based Mandarin Chinese Text-To-Speech System

In this paper we present our Hidden Markov Model (HMM)-based, Mandarin Chinese Text-to-Speech (TTS) system. Mandarin Chinese or Putonghua, “the common spoken language”, is a tone language where each of the 400 plus base syllables can have up to 5 different lexical tone patterns. Their segmental and supra-segmental information is first modeled by 3 corresponding HMMs, including: (1) spectral env...

Publication date: 2010